
Conversation

@d-v-b (Contributor) commented Oct 30, 2025

Since #3554 was an unpopular direction, I'm going instead with CodSpeed + pytest-benchmark. Opening as a draft because I haven't looked into how CodSpeed works at all, but I'd like people to weigh in on whether these initial benchmarks make sense. Naturally we can add more specific ones later, but I figured some bulk array read / write workloads would be a good start.
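
For reference, a minimal sketch of what a bulk read / write benchmark with pytest-benchmark could look like (illustrative only: the array shape, chunk size, in-memory store, and the zarr-python 3 `create_array` call are assumptions, not necessarily what this PR adds):

```python
# Hypothetical bulk read / write benchmarks using pytest-benchmark.
# Shapes, chunk sizes, and the in-memory store are illustrative assumptions.
import numpy as np
import pytest
import zarr
from zarr.storage import MemoryStore

SHAPE = (1024, 1024)
CHUNKS = (256, 256)


@pytest.fixture
def data() -> np.ndarray:
    return np.random.default_rng(0).integers(0, 255, size=SHAPE, dtype="uint8")


def test_write_full(benchmark, data: np.ndarray) -> None:
    arr = zarr.create_array(store=MemoryStore(), shape=SHAPE, chunks=CHUNKS, dtype=data.dtype)
    # time a bulk write of the whole array
    benchmark(arr.__setitem__, Ellipsis, data)


def test_read_full(benchmark, data: np.ndarray) -> None:
    arr = zarr.create_array(store=MemoryStore(), shape=SHAPE, chunks=CHUNKS, dtype=data.dtype)
    arr[...] = data
    # time a bulk read of the whole array
    benchmark(arr.__getitem__, Ellipsis)
```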

@github-actions github-actions bot added the needs release notes Automatically applied to PRs which haven't added release notes label Oct 30, 2025
@github-actions github-actions bot removed the needs release notes Automatically applied to PRs which haven't added release notes label Oct 30, 2025
@d-v-b d-v-b marked this pull request as ready for review October 30, 2025 13:46
@d-v-b (Contributor, Author) commented Oct 30, 2025

@zarr-developers/steering-council I don't have permission to register this repo with CodSpeed. I submitted a request to register it; could someone approve it?

@d-v-b d-v-b requested review from dcherian and jhamman and removed request for jhamman October 30, 2025 14:37
@normanrz (Member) replied:

> @zarr-developers/steering-council I don't have permission to register this repo with CodSpeed. I submitted a request to register it; could someone approve it?

done

@d-v-b (Contributor, Author) commented Oct 30, 2025

Does anyone have opinions about benchmarks? Feel free to suggest something concrete. Otherwise, I think we should take this as-is and deal with later benchmarks (like partial shard reads / writes) in a subsequent PR.

@codspeed-hq bot commented Oct 30, 2025

CodSpeed Performance Report

Congrats! CodSpeed is installed 🎉

🆕 30 new benchmarks were detected.

You will start to see performance impacts in the reports once the benchmarks are run from your default branch.

Detected benchmarks


ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

Review thread on the benchmark CI workflow:

    uses: CodSpeedHQ/action@v4
    with:
      mode: instrumentation
      run: hatch run test.py3.11-1.26-minimal:run-benchmark

@dcherian (Contributor) commented:

Can we test the latest instead? Seems more appropriate...

@d-v-b (Contributor, Author) replied:

The latest version of Python? What's the reasoning? I'd rather update this file when we drop a supported version vs. when a new version of Python comes out.

@dcherian (Contributor) commented Oct 31, 2025

Because we'd want to catch a perf regression from upstream changes too? I'm suggesting the latest versions of released libraries: py=3.13, np=2.2.

@d-v-b (Contributor, Author) replied:

We don't have an upper bound on numpy versions, so I don't think this particular workflow will help us catch regressions from upstream changes -- we would need to update this workflow every time a new version of numpy is released. IMO that's something we should do in a separate benchmark workflow. This workflow here will run on every PR, and in that case the oldest version of numpy we support seems better.

We also don't have to use a pre-baked hatch environment here; we could define a dependency set specific to benchmarking. But my feeling is that benchmarking against older versions of stuff gives us a better measure of what users will actually experience.

@dcherian (Contributor) commented:

> feel free to suggest something concrete

indexing please. that'll exercise the codec pipeline too.

a peakmem metric would be good to track also, if possible.
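
For concreteness, a hypothetical sketch of the kind of indexing benchmark being suggested (not part of this PR; the selections, array shape, chunking, and store are made up), where each selection only touches a subset of chunks and therefore exercises the codec pipeline:

```python
# Hypothetical indexing benchmarks (not in this PR): each selection decodes
# only the chunks it touches, so partial reads exercise the codec pipeline.
import numpy as np
import pytest
import zarr
from zarr.storage import MemoryStore

SELECTIONS = {
    "row": np.s_[0, :],
    "column": np.s_[:, 0],
    "block": np.s_[100:200, 100:200],
}


@pytest.fixture(scope="module")
def array() -> zarr.Array:
    arr = zarr.create_array(
        store=MemoryStore(), shape=(2048, 2048), chunks=(256, 256), dtype="uint8"
    )
    arr[...] = np.ones((2048, 2048), dtype="uint8")
    return arr


@pytest.mark.parametrize("selection", list(SELECTIONS.values()), ids=list(SELECTIONS))
def test_read_selection(benchmark, array: zarr.Array, selection) -> None:
    # time a partial read through the given selection
    benchmark(array.__getitem__, selection)
```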

@d-v-b (Contributor, Author) commented Oct 31, 2025

> feel free to suggest something concrete
>
> indexing please. that'll exercise the codec pipeline too.
>
> a peakmem metric would be good to track also, if possible.

I don't think CodSpeed or pytest-benchmark do memory profiling; we would need https://pytest-memray.readthedocs.io/en/latest/ or something equivalent for that.

And an indexing benchmark sounds like a great idea, but I don't think I have the bandwidth for it in this PR right now.
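
If peak memory does get tracked later, a rough sketch of what a pytest-memray check could look like (purely illustrative; the memory budget, array size, and test name are made up, and this is not part of this PR):

```python
# Hypothetical peak-memory check with pytest-memray (not part of this PR).
# The memory budget and array size below are made up.
import numpy as np
import pytest
import zarr
from zarr.storage import MemoryStore


@pytest.mark.limit_memory("300 MB")
def test_bulk_read_peak_memory() -> None:
    arr = zarr.create_array(
        store=MemoryStore(), shape=(4096, 4096), chunks=(512, 512), dtype="uint8"
    )
    arr[...] = np.ones((4096, 4096), dtype="uint8")
    # memray records the peak allocations made during this read
    _ = arr[...]
```

Run with `pytest --memray` so the marker is enforced.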

@d-v-b (Contributor, Author) commented Nov 3, 2025

I added a benchmark that clearly reveals the performance improvement of #3561.

@d-v-b (Contributor, Author) commented Nov 3, 2025

I added some slice-based benchmarks based on the examples from #3524, and I updated the contributing docs with a section about the benchmarks. Assuming we can resolve the discussion about which Python / NumPy versions to use in the CI job, I think this is ready.

@d-v-b (Contributor, Author) commented Nov 3, 2025

New problem: the CodSpeed CI benchmarks are way too slow! The benchmark suite runs in 90s locally, but it's taking over 40m to run in CI. Help would be appreciated in speeding this up.

@d-v-b (Contributor, Author) commented Nov 3, 2025

Owing to the large number of syscalls in our benchmark code, CodSpeed recommended using the walltime instrument instead of their virtual CPU instrument. But to turn on walltime benchmarking, we would need to run our benchmarking code on CodSpeed's servers, which is a security risk.

Given that CodSpeed is not turning out to be particularly simple, I am inclined to defer the CodSpeed CI stuff for later work. But if someone can help get the test runtime down, and / or we are OK running our benchmarks on CodSpeed's servers, then maybe we can get that sorted in this PR.
